Investigating Machine Learning and Control Theory Approaches for Process Fault Detection: A Comparative Study of KPCA and the Observer-Based Method

This paper addresses the importance of prompt and efficient process fault detection in contemporary manufacturing industries, where product quality and safety protocols are critical. The study compares the effectiveness of two techniques for process fault detection: Kernel Principal Component Analysis (KPCA) and the observer method. Both techniques are applied to monitor water volume variation within a hydraulic system comprising three tanks. KPCA is an unsupervised learning technique used for dimensionality reduction and pattern recognition. It is an extension of Principal Component Analysis (PCA) that uses kernel functions to map data into a higher-dimensional space, where it becomes easier to separate classes or identify patterns. In this paper, KPCA is applied to detect faults in the hydraulic system by analyzing the variation in water volume. The observer method originates from control theory and estimates the internal states of a system from its output measurements. It is commonly used in control systems to estimate unmeasurable or hidden states, which is crucial for ensuring proper control and fault detection. In this study, the observer method is applied to the hydraulic system to estimate the water volume variations within the three tanks. The results of the comparative study show that KPCA and the observer method perform similarly in detecting faults within the system. This similarity in performance highlights the efficacy of both techniques and their potential adaptability to various fault diagnosis scenarios within modern manufacturing processes.


Introduction
The contemporary manufacturing landscape demands high product quality and safe operational practices. To maintain optimal system functionality and reduce downtime in case of failure, the early detection of process faults is critical [1]. Consequently, several process monitoring methods based on Multivariate Statistical Process (MSP) techniques have been developed, valued for their efficiency and simplicity [2,3].
The Kernel Principal Component Analysis (KPCA) method, a simple yet interesting technique developed by Ratsch et al. [4], is designed to accurately model nonlinear relationships inherent in process data. Utilizing the principle of the kernel trick [5], KPCA can efficiently project input data with linearly inseparable structures onto a higher-dimensional feature space in which the data become linearly separable. The kernel function in KPCA allows for implicit computations in the high-dimensional space without explicitly transforming the data. Commonly used kernel functions include the Gaussian (or radial basis function), polynomial, and sigmoid kernels. These kernels measure the similarity or dissimilarity between data points, enabling the extraction of nonlinear features that would be challenging to capture using linear techniques.
The KPCA algorithm involves three main steps: (1) computation of the kernel matrix, which stores the pairwise similarities between data points based on the chosen kernel function, (2) eigendecomposition of the kernel matrix to obtain the principal components, and (3) projection of the data onto the principal components to obtain the transformed features.
KPCA finds applications in various fields, including computer vision, pattern recognition, bioinformatics, and signal processing. It enables the detection of nonlinear patterns, clustering of complex data, and nonlinear dimensionality reduction. By leveraging the power of nonlinear mapping, KPCA offers a valuable tool for analyzing and extracting meaningful information from high-dimensional and nonlinear datasets for education and analysis [23,24].
KPCA consists of transforming the nonlinear structure of the input data space E into a linear one within a new high-dimensional feature space, denoted H, and performing PCA in that space. The feature space H is obtained from the input space E through a nonlinear mapping function $\phi$. The mapping of a sample $x \in E$ in the feature space H can be written as:
$$\phi : x \in E \subset \mathbb{R}^m \mapsto \phi(x) \in H \subset \mathbb{R}^h$$
Let us consider $X = [x_1, \ldots, x_i, \ldots, x_N]^T$, the training data matrix scaled to zero mean and unit variance, where $x_i \in E \subset \mathbb{R}^m$ is a data vector, N is the number of observation samples, and m is the number of process variables.
The monitoring phase based on the linear PCA approach requires the selection of principal components that maximize the variance in the data set. This is accomplished using the eigendecomposition of the covariance matrix. This approach was generalized in the Kernel PCA approach by Ratsch [4]. The covariance matrix $C^{\Phi}$ in the feature space H is given by:
$$C^{\Phi} = \frac{1}{N} \sum_{i=1}^{N} \phi(x_i)\, \phi(x_i)^T$$
Let $\chi = [\phi(x_1) \; \ldots \; \phi(x_N)]^T \in \mathbb{R}^{N \times h}$ define the data matrix in the feature space H; then $C^{\Phi}$ can be expressed as:
$$C^{\Phi} = \frac{1}{N} \chi^T \chi$$
The principal components of the mapped data $\phi(x_1), \ldots, \phi(x_N)$ are computed by solving the eigenvalue decomposition of $C^{\Phi}$, such that:
$$\lambda_j \mu_j = C^{\Phi} \mu_j$$
with $\mu_j$ being the jth eigenvector and $\lambda_j$ the associated jth eigenvalue. For $\lambda_j \neq 0$, there exist coefficients $\alpha_{i,j}$, $i = 1, \ldots, N$, such that all eigenvectors $\mu_j$ can be considered as a linear combination of $[\phi(x_1) \; \phi(x_2) \; \ldots \; \phi(x_N)]$ and can be expressed by:
$$\mu_j = \sum_{i=1}^{N} \alpha_{i,j}\, \phi(x_i)$$
However, in practice, the mapping function $\phi$ is not defined, so the covariance matrix $C^{\Phi}$ in the feature space cannot be calculated explicitly. Thus, instead of solving the eigenvalue problem directly on $C^{\Phi}$, we apply the kernel trick first used for the Support Vector Machine (SVM) [25]. The inner product given in Equation (2) may be calculated by a kernel function k that satisfies Mercer's theorem [12] as follows:
$$\langle \phi(x_i), \phi(x_j) \rangle = k(x_i, x_j)$$
Let us define a kernel matrix K associated with a kernel function k as:
$$K = \chi \chi^T = \left[ k(x_i, x_j) \right]_{i,j = 1, \ldots, N} \in \mathbb{R}^{N \times N}$$
Applying the kernel matrix may reduce the problem of the eigenvalue decomposition of $C^{\Phi}$ [26]. Hence, the eigendecomposition of the kernel matrix K is equivalent to performing PCA in the feature space H, so that:
$$K V = N V \Lambda$$
where $\Lambda$ is the diagonal matrix of eigenvalues $\lambda_j$ arranged in descending order and V is the matrix of their corresponding eigenvectors.
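Since the principal components are orthonormal, the normality of $\mu_j$ in Equation (4) must be guaranteed, such that:
$$\langle \mu_j, \mu_j \rangle = \sum_{i=1}^{N} \sum_{k=1}^{N} \alpha_{i,j}\, \alpha_{k,j}\, k(x_i, x_k) = 1, \quad j = 1, \ldots, n$$
where n is the number of the first non-zero eigenvalues.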
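The three KPCA steps described above (kernel matrix, eigendecomposition, projection) can be illustrated with a minimal NumPy sketch. It is not the implementation used in this study: the function names `rbf_kernel` and `fit_kpca`, the kernel width `sigma`, the tolerance `tol`, and the random data are assumptions made for illustration, and kernel centering is omitted for brevity. The returned eigenvalues are those of K itself.

```python
import numpy as np

def rbf_kernel(X, Y, sigma):
    """Gaussian kernel matrix with entries exp(-||x - y||^2 / (2 sigma^2))."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_kpca(X, sigma, tol=1e-8):
    """Eigendecompose K and scale each alpha_j so that alpha_j^T K alpha_j = 1."""
    K = rbf_kernel(X, X, sigma)                     # step 1: kernel matrix
    eigvals, eigvecs = np.linalg.eigh(K)            # step 2: eigendecomposition
    order = np.argsort(eigvals)[::-1]               # descending eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > tol * eigvals[0]               # drop numerically zero modes
    eigvals, eigvecs = eigvals[keep], eigvecs[:, keep]
    alphas = eigvecs / np.sqrt(eigvals)             # unit-norm mu_j in H
    return K, eigvals, alphas

# Illustrative data only (assumption): 50 samples of 3 variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
K, eigvals, alphas = fit_kpca(X, sigma=2.0)
scores = K @ alphas                                 # step 3: projection
```

Dividing each unit-norm eigenvector of K by the square root of its eigenvalue is one way to enforce the normality condition stated above.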

Number of Principal Components
Determining the number of retained principal components ($\ell$) is an important step of modeling based on KPCA. The Cumulative Percent Variance (CPV) has been proposed to compute the number of retained PCs ($\ell$) [27,28]. The CPV is the sum of the first $\ell$ eigenvalues divided by the sum of all eigenvalues. It can be expressed as:
$$CPV(\ell) = \frac{\sum_{j=1}^{\ell} \lambda_j}{\sum_{j=1}^{n} \lambda_j} \times 100$$
The number of retained PCs $\ell$ is chosen as the smallest value for which the CPV is higher than 95%.
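As a small worked example of the CPV criterion, the sketch below selects the number of components from a list of eigenvalues. The function name `select_n_components` and the eigenvalues are hypothetical and serve only to illustrate the rule.

```python
import numpy as np

def select_n_components(eigvals, threshold=95.0):
    """Return the smallest l whose CPV reaches the threshold, and the CPV curve."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    cpv = 100.0 * np.cumsum(eigvals) / eigvals.sum()
    n_components = int(np.searchsorted(cpv, threshold)) + 1
    return n_components, cpv

# Hypothetical eigenvalues: the CPV curve is roughly 60, 90, 96, 99, 100 percent.
n_components, cpv = select_n_components([6.0, 3.0, 0.6, 0.3, 0.1])
```

With these values, two components explain about 90% of the variance, so three components are needed to exceed the 95% threshold.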

Fault Detection
As in the PCA approach, the squared prediction error (SPE) is usually used for fault detection using KPCA [29,30]. However, conventional KPCA does not provide any approach to data reconstruction in the feature space, which makes the computation of the SPE index difficult. Kim [31] and Lahdhiri [32] proposed a simple expression to calculate the SPE in the feature space H, which is shown as follows:
$$SPE = k(x, x) - k_x^T\, \hat{P}\, \hat{\Lambda}^{-1}\, \hat{P}^T\, k_x$$
where $\hat{P} = [\alpha_1, \ldots, \alpha_\ell]$ is the matrix of the first $\ell$ principal eigenvectors of K, $\hat{\Lambda} = \mathrm{diag}(\lambda_1, \ldots, \lambda_\ell)$ is the diagonal matrix of the first $\ell$ eigenvalues of K [33], and
$$k_x = [k(x_1, x) \; \ldots \; k(x_N, x)]^T$$
The confidence limit for the SPE index can be calculated using the $\chi^2$ distribution and is given by:
$$SPE \leq \delta_\alpha^2$$
where $\delta_\alpha^2$ is the control limit expressed by:
$$\delta_\alpha^2 = g\, \chi^2_{h,\alpha}$$
with $g = \frac{b}{2a}$ and $h = \frac{2a^2}{b}$, where a is the estimated mean and b is the variance of the SPE [34,35].
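The SPE index and its control limit can be sketched as follows. This is an illustration under stated assumptions rather than the code used in this study: the RBF kernel form (for which k(x, x) = 1), the uncentered kernel, the 99% CPV and confidence levels, the synthetic training data, and all function names are assumptions. The chi-square quantile is approximated with the Wilson-Hilferty formula so that only NumPy and the standard library are needed; an exact quantile routine could be substituted.

```python
import numpy as np
from statistics import NormalDist

def rbf(X, Y, sigma):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def spe_index(X_train, X_new, P, lam, sigma):
    """SPE = k(x, x) - k_x^T P Lambda^{-1} P^T k_x for every row x of X_new."""
    Kx = rbf(X_new, X_train, sigma)              # each row is k_x^T
    proj = Kx @ P                                # k_x^T P
    return 1.0 - (proj**2 / lam).sum(axis=1)     # k(x, x) = 1 for this kernel

def spe_control_limit(spe_train, alpha=0.99):
    """delta^2 = g * chi2_{h, alpha}, with g = b / (2a) and h = 2a^2 / b."""
    a, b = spe_train.mean(), spe_train.var()
    g, h = b / (2.0 * a), 2.0 * a**2 / b
    z = NormalDist().inv_cdf(alpha)
    # Wilson-Hilferty approximation of the chi-square quantile (assumption).
    chi2_q = h * (1.0 - 2.0 / (9.0 * h) + z * np.sqrt(2.0 / (9.0 * h)))**3
    return g * chi2_q

# Synthetic fault-free training data (assumption): 200 samples in [-1, 1]^2.
rng = np.random.default_rng(1)
X_train = rng.uniform(-1.0, 1.0, size=(200, 2))
sigma = 1.0
lam, V = np.linalg.eigh(rbf(X_train, X_train, sigma))
lam, V = lam[::-1], V[:, ::-1]                   # descending order
ell = int(np.searchsorted(np.cumsum(lam) / lam.sum(), 0.99)) + 1
P, lam_l = V[:, :ell], lam[:ell]                 # unit-norm eigenvectors of K

limit = spe_control_limit(spe_index(X_train, X_train, P, lam_l, sigma))
x_far = np.array([[10.0, 10.0]])                 # sample far from normal operation
is_fault = spe_index(X_train, x_far, P, lam_l, sigma) > limit
```

A sample far from the training region has a kernel vector close to zero, so its SPE approaches k(x, x) = 1 and exceeds the limit estimated from fault-free data.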

The Observer Method
In fault diagnosis, an observer is a mathematical model or algorithm used to estimate the state variables and fault parameters of a system based on available measurements. The structure of an observer varies depending on the type of system being observed and the nature of the faults being diagnosed. However, the general structure of an observer typically involves the following components:
• System Model: The observer relies on a mathematical model that describes the dynamics of the system being observed. This model can be derived from first principles or obtained through system identification techniques.
• Measurement Equation: The observer uses a measurement equation that relates the system's state variables to the available measurements. This equation can be derived from the system model and typically includes sensor equations and/or sensor noise models.
• State Estimation: The core of the observer is the state estimation algorithm, which updates and estimates the system's state variables based on the available measurements. Various estimation techniques can be used, such as Kalman filters, extended Kalman filters, particle filters, or model-based observers like the sliding mode observer.
• Fault Detection: In fault diagnosis, the observer is also responsible for detecting the occurrence of faults. This can be done by comparing the estimated state variables with expected values or by analyzing the residuals between the measurements and the estimated values.
• Fault Parameter Estimation: If faults are detected, the observer may also estimate the fault parameters, such as fault magnitudes, locations, or characteristics. This is typically done by incorporating fault models into the observer and updating the estimated parameters based on the available measurements and fault detection results.
• Adaptation and Learning: Depending on the observer's design, it may incorporate adaptation or learning mechanisms to improve its performance over time. These mechanisms allow the observer to adapt to changes in system dynamics or fault characteristics, or to learn from historical data to enhance its fault diagnosis capabilities.
It is important to note that the specific structure and algorithm used for fault diagnosis observers can vary greatly depending on the application, system complexity, and available information. Different domains, such as power systems, automotive, or aerospace, may have specialized observer designs tailored to their specific requirements.
Consider a system with r inputs denoted u(t) and m output measurements denoted y(t). The dynamic behavior of this system is described by the following equations [36]:
$$\dot{x}(t) = A x(t) + B u(t), \qquad y(t) = C x(t)$$
where $x(t) \in \mathbb{R}^n$ is the state vector, $y(t) \in \mathbb{R}^m$ is the output vector [37], and $u(t) \in \mathbb{R}^r$ is the input vector.
Note that matrices A, B, and C represent the state-space description of a linear time-invariant system [38,39], and their dimensions are compatible with those of the vectors x(t), u(t), and y(t).
Given that the state is not generally available [40], the objective is to design an observer that estimates this state, for instance to implement feedback control, by a variable denoted $\hat{x}(t)$. This estimate is produced by a dynamic system whose output is precisely $\hat{x}(t)$ and whose input consists of all the information available [41], that is to say, u(t) and y(t). The structure of an observer can be written as:
$$\dot{\hat{x}}(t) = A \hat{x}(t) + B u(t) + L \left( y(t) - \hat{y}(t) \right), \qquad \hat{y}(t) = C \hat{x}(t)$$
where the correction term appears clearly in terms of the reconstruction error of the output $y(t) - \hat{y}(t)$, weighted by a gain matrix L [42] to be determined, called the observer gain [43]. This structure can be written as:
$$\dot{\hat{x}}(t) = (A - LC)\, \hat{x}(t) + B u(t) + L y(t)$$
If we consider the estimation error $e(t) = x(t) - \hat{x}(t)$, we obtain:
$$\dot{e}(t) = (A - LC)\, e(t)$$
The observer described by Equation (15) is illustrated in Figure 1.
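The error dynamics of such an observer can be checked numerically on a small example. The sketch below is illustrative only: the second-order matrices A, B, C, the gain L (chosen by hand so that A − LC has eigenvalues −5 and −6), the initial conditions, and the forward-Euler integration are all assumptions, not the three-tank model of this study.

```python
import numpy as np

# Hypothetical second-order LTI system (assumption, for illustration only).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
L = np.array([[8.0], [4.0]])    # chosen so that eig(A - L C) = {-5, -6}

def run_observer(x0, xhat0, u, dt=1e-3, steps=3000):
    """Forward-Euler simulation of the plant and of the observer."""
    x, xhat = np.array(x0, float), np.array(xhat0, float)
    err = [np.linalg.norm(x - xhat)]
    for _ in range(steps):
        y, yhat = C @ x, C @ xhat
        x_next = x + dt * (A @ x + B @ u)
        # Observer: model copy plus correction L (y - yhat).
        xhat = xhat + dt * (A @ xhat + B @ u + L @ (y - yhat))
        x = x_next
        err.append(np.linalg.norm(x - xhat))
    return np.array(err)

err = run_observer(x0=[1.0, -1.0], xhat0=[0.0, 0.0], u=np.array([1.0]))
poles = np.sort(np.linalg.eigvals(A - L @ C).real)
```

Because A − LC is Hurwitz in this example, the norm of the estimation error decays toward zero regardless of the input, which is the behavior predicted by the error equation.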

Fault Detection Observer
In this work, we assume that the actuators and sensors are affected by faults. Our goal is to detect and isolate these faults [43]. The state of the system model can be written as:
where $f_a(t)$ is the actuator fault and $f_c(t)$ is the sensor fault.
where $\hat{x}(t) \in \mathbb{R}^n$ is the estimated state vector and K is a gain matrix such that A − KC is stable, that is, its eigenvalues have negative real parts. This leads to:
If K is such that A − KC is a Hurwitz matrix, the residual $\tilde{y}(t)$ tends to 0 in the absence of faults. The transfer between faults and residuals can be written:
where p is the time-derivative operator. Taking into account the matrix inversion lemma, this leads to [44]:
which can be written as:
From this relationship, and in the absence of actuator faults, the system for isolating sensor faults from the residuals can be written as:
This variable is estimated using the same system as before, that is to say:
The estimate of the sensor fault is given by the inversion of the initial model of the system:
where the result is not an estimate of the faults but rather a filtered version of them; however, the diagonal character of ∆(p) allows their isolation [20].
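To make the residual-based detection concrete, the following sketch simulates an observer whose measured output is corrupted by a constant sensor bias. Everything in it is an assumption made for illustration: the second-order system, the gain K, the additive sensor fault model y = Cx + f_c, the fault size and onset, and the threshold. It does not reproduce the isolation filter discussed above.

```python
import numpy as np

# Hypothetical system and observer gain (assumptions, for illustration only).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
K = np.array([[8.0], [4.0]])    # eig(A - K C) = {-5, -6}, a Hurwitz matrix

def residuals(fault_size, fault_step, steps=8000, dt=1e-3):
    """Residual y - C xhat with an additive sensor bias from fault_step on."""
    x, xhat, u = np.zeros(2), np.zeros(2), np.array([1.0])
    r = np.zeros(steps)
    for k in range(steps):
        f_c = fault_size if k >= fault_step else 0.0
        y = C @ x + f_c                       # faulty measurement (assumed model)
        res = y - C @ xhat
        r[k] = res[0]
        x, xhat = (x + dt * (A @ x + B @ u),
                   xhat + dt * (A @ xhat + B @ u + K @ res))
    return r

r = residuals(fault_size=0.5, fault_step=4000)
threshold = 0.01                              # hypothetical detection threshold
first_alarm = int(np.argmax(np.abs(r) > threshold))
```

In this example the residual is zero before the fault, jumps when the bias appears, and settles at a nonzero value equal to the bias divided by 15. This illustrates the remark above that the residual is a filtered version of the fault rather than a direct estimate of it.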

Overview of Three-Tank System Applications
A three-tank water system, also referred to as a triple-tank or multi-tank system, is a water storage and distribution network that uses multiple tanks for different functions. Specific uses for three-tank water systems vary, but typically fall within environmental engineering, including rainwater harvesting and reuse, greywater recycling, and off-grid water supply [46][47][48]. These applications support sustainable practices, more specifically SDG 6 (Clean Water and Sanitation) [46]. A three-tank system can be employed for rainwater harvesting and reuse, with each tank serving its own function. One tank could collect rainwater from the roof, another could be dedicated to filtering and treatment (primary, secondary, and tertiary water treatment processes), while a third could store treated rainwater for urban reuse in fields such as irrigation. This system helps decrease the reliance on freshwater sources while supporting sustainable water management practices [46].
Triple-tank systems provide an effective method of greywater recycling by treating and reusing urban residual water generated from domestic activities such as bathing, handwashing, and laundry. One tank collects urban water for disinfection processes (mostly chlorine and combined chlorine sub-products) [49], while a second tank stores treated water for reuse in activities such as toilet flushing or landscape irrigation. This approach helps relieve the stress on freshwater resources while alleviating strain on sewerage systems [47]. In remote or off-grid locations, the availability of reliable water resources is limited. Hence, a three-tank water system can serve as a reliable, self-sufficient water supply. One tank could hold water collected from natural sources such as wells or springs, another would treat and purify it, and the third would store the treated water for domestic consumption [48].
On the other hand, several shortcomings and limitations may arise from the installation of three-tank systems. Because the system is divided into three units, its complexity and space requirements increase [46][47][48]. Additional space is needed for the tanks as well as for the components that connect them (pumps, valves, etc.). In addition, this system exhibits low cost-effectiveness, as it involves high installation costs compared to a simple water storage system [50]. The maintenance and additional unit operations required for water screening, treatment, and distribution incur further expenses [51]. Given the complexity of the system, more design heuristics and trade-offs must be considered, which makes maintenance more tedious and time-consuming [51]. In terms of human factors and engineering intuition, a high level of skill and handling knowledge is required to preserve the integrity of the system [46][47][48].
Most of the aforementioned issues could be mitigated by implementing a proper control system, given that multiple error sources may arise. KPCA is a convenient unsupervised machine learning approach for three-tank system control, as it is designed to deal with intercorrelated input data [52]. In other words, an error in one part of the system will influence the other components.

Process Description of a Hydraulic System with Three Tanks
As illustrated in Figure 2, the considered process is a three-tank system with two inputs and three outputs. It consists of three tanks with identical sections, supplied with distilled water. They are serially interconnected by two cylindrical pipes with identical sections [53,54]. The pipes of communication between tanks T 1 and T 2 are equipped with manually adjustable valves; the flow rates of the connection pipes can be controlled using ball valves a z1 and a z2 . The plant has one outlet pipe located at the bottom of tank T 3 . Three other pipes, one installed at the bottom of each tank, provide a direct connection (outflow rate) to the reservoir through ball valves b z1 , b z2 , and b z3 , respectively. These pipes can only be manipulated manually [12]. Pumps 1 and 2 are supplied with water from the reservoir with flow rates Q 1 (t) and Q 2 (t), respectively. The necessary level measurements h 1 (t), h 2 (t), and h 3 (t) are carried out by piezo-resistive differential pressure sensors.
The state equations are obtained by writing that the variation of the water volume in a tank is equal to the difference between the incoming and outgoing flows, given that the water of tanks 1 and 2 can flow toward tank 3. The system can then be represented by the following equations:
where $Q^{in}_{i}(t)$ is the flow through pump i (i = 1, 2), and $Q^{out1}_{ij}(t)$ represents the flow rate of water between tanks i and j (i, j = 1, 2, 3, i ≠ j), which can be expressed using the law of Torricelli [29],
and $Q^{out2}_{ij}(t)$ represents the outflow rate, given by:
where $h_i(t)$, $Q^{in}_{i}(t)$, and $Q^{out}_{ij}(t)$ are, respectively, the water levels, the input flow rates, and the output flow rates.
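The volume balance and Torricelli-type flows described above can be sketched in a few lines. The following simulation is an assumption-laden illustration, not the model identified in this study: the tank and pipe cross-sections, the flow coefficients, the pump flows, the topology (tanks 1 and 2 each draining into tank 3, a single outlet on tank 3, leak valves closed), and the forward-Euler scheme are all hypothetical.

```python
import numpy as np

G = 9.81          # gravitational acceleration, m/s^2
S = 0.0154        # tank cross-section, m^2 (hypothetical)
S_N = 5e-5        # pipe cross-section, m^2 (hypothetical)
A13 = A23 = 0.5   # flow coefficients of the connecting pipes (hypothetical)
B3 = 0.6          # flow coefficient of the outlet of tank 3 (hypothetical)

def torricelli(coeff, dh):
    """Signed flow through a pipe driven by the level difference dh."""
    return coeff * S_N * np.sign(dh) * np.sqrt(2.0 * G * abs(dh))

def level_derivatives(h, q1, q2):
    """Volume balance of each tank: (inflow - outflow) / cross-section."""
    q13 = torricelli(A13, h[0] - h[2])
    q23 = torricelli(A23, h[1] - h[2])
    q30 = torricelli(B3, h[2])
    return np.array([q1 - q13, q2 - q23, q13 + q23 - q30]) / S

def simulate(q1, q2, t_end=6000.0, dt=0.5):
    """Forward-Euler integration of the three levels, starting from empty tanks."""
    h = np.zeros(3)
    for _ in range(int(t_end / dt)):
        h = h + dt * level_derivatives(h, q1, q2)
    return h

h_ss = simulate(q1=3e-5, q2=3e-5)   # levels after the transient, in metres
```

With these hypothetical values, the levels settle where the outlet flow of tank 3 equals the sum of the two pump flows, which corresponds to h 3 of about 0.204 m and a level difference of about 0.073 m between tanks 1 and 3.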
The parameters of the three-tank system are defined as follows. The controlled signals are the water levels (h 2 , h 3 ) of tank 2 and tank 3. These levels are controlled by two pumps. The system can be considered as a multi-input multi-output (MIMO) system [54], where the inputs are the inflow rates Q 1 and Q 2 and the outputs are the liquid levels h 2 and h 3 . The three-tank system can then be modeled by the following three differential equations:
where the parameters c i , i = 1, 3 and B j , j = 1, 2, 3, 4 are defined by:
Taking B 1 = B 2 = B 3 = 0, the three equations of the system become:
At equilibrium, for a constant water level set point, the level derivatives must be zero.
Therefore, using (31) in the steady state, the following algebraic relationship holds. For the coupled-tank system, the fluid flow Q 1 into tank 1 cannot be negative because the pump can only drive water into the tank, then:
From (36), we have: and Therefore, if we assume a
Which can be written as:

Simulation Results
We are interested in the fault detection of the pressure sensors, which measure the water levels (h 2 , h 3 ) of tank 2 and tank 3, using conventional KPCA and the observer method. A total of 5000 samples were generated from this process [55]. The first 1000 samples were used to construct the KPCA model; the last 3000 samples were used to test the fault detection methods. We used the Radial Basis Function (RBF) kernel [56,57]. Two types of faults are considered [58]: faults in the pressure sensor of tank 2 and faults in the pressure sensor of tank 3.
Case 1: faults in the pressure sensor of tank 2 (water level h 2 ). Fault 1: a step bias of h 2 by adding 10% more than its range of variation [59]. The fault is introduced between samples 1500 and 3000.
The SPE index is a statistical measure commonly used in multivariate analysis and process monitoring. It quantifies the discrepancy between the values predicted by a model and the observed values. Figure 3 shows the evolution of the SPE (Squared Prediction Error) index over the samples in the presence of Fault 1, which affects the variable h 2 .
Fault 2: a step bias of h 2 by adding 20% more than its range of variation. The fault is introduced between samples 3500 and 4500.
Case 2: Faults in the pressure sensor of tank 3 (water level h 3 ) Fault 1: a step bias of h 3 by adding 10% more than its range of variation. The fault is introduced between samples 2000 and 3000 [60].
Fault 2: a step bias of h 3 by adding 20% more than its range of variation. The fault is introduced between samples 3000 and 4000.
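The step-bias faults listed above can be emulated on a recorded signal as sketched below. The helper name `inject_step_bias` and the sinusoidal test signal are hypothetical. The sketch interprets the bias as a fraction of the range of variation of the signal and the sample window as half-open and zero-based, which is one possible reading of the fault descriptions.

```python
import numpy as np

def inject_step_bias(signal, start, stop, fraction):
    """Add a constant bias, a fraction of the signal range, on [start, stop)."""
    faulty = np.array(signal, dtype=float, copy=True)
    span = faulty.max() - faulty.min()        # range of variation of the signal
    faulty[start:stop] += fraction * span
    return faulty

# Hypothetical level signal: 5000 samples oscillating around 0.2 m.
t = np.arange(5000)
h3 = 0.2 + 0.01 * np.sin(2.0 * np.pi * t / 500.0)
h3_faulty = inject_step_bias(h3, start=2000, stop=3000, fraction=0.10)
bias = h3_faulty - h3
```

The faulty signal differs from the original only inside the chosen window, where it is shifted by 10% of the range of the fault-free signal.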
Based on the observations from Figures 4-10, both the kernel method and the observer technique detect the occurrence of each fault, as indicated by the faulty outputs h 2 and h 3 . Both methods yield effective and comparable results in terms of sensor fault detection. The simulation results therefore show that the observer-based model and the KPCA technique perform similarly for fault detection and identification on this system, which suggests that both can be effective in diagnosing faults. However, the relative performance of these techniques may depend on the specific context of the application and the characteristics of the system being studied. Further studies and experimental tests may be necessary to confirm these results and assess their applicability in other domains.

Conclusions
This paper examines an observer-based model and a kernel technique, KPCA, both employed for sensor fault detection [61]. The operating principle of the observer is explained in detail. To assess the effectiveness of these two techniques, a comparative analysis is conducted using a three-tank process. The simulation results show that both the observer-based model and the KPCA method yield satisfactory results. However, these findings are based on a single case study and should not be generalized without further evidence. To establish the broader applicability of these techniques across various systems and conditions, further investigation is necessary. The approach was validated by injecting faults and observing the reaction of the system, with the fault time window changed for each test. The fault identification procedure is thereby validated on this process and can be used in industry, specifically in chemical systems.

The underlying problem is the difficulty of detecting faults. Our method has proven capable of detecting faults despite changes in the nature of the faults and in the integration time.
Consequently, additional research is planned to advance the understanding of fault detection techniques through a deeper exploration of these approaches.
Kernel Principal Component Analysis (KPCA) and observer-based approaches are two distinct methods commonly used in fault detection within the realms of machine learning and control theory. KPCA is a nonlinear dimensionality reduction technique that maps data into a high-dimensional feature space using kernel functions, enabling the detection of faults in complex and nonlinear systems. Its advantage lies in its ability to handle nonlinearity, making it suitable for intricate processes. However, the performance of KPCA relies heavily on the appropriate choice of kernel and its associated parameters, which can be challenging to determine in practice. On the other hand, observer-based approaches leverage mathematical models to estimate the system's behavior and compare it with the actual response to detect faults. The advantage of observer-based methods is their inherent robustness to disturbances and noise, allowing them to perform well in noisy environments. Nevertheless, these approaches often require a comprehensive and accurate model of the system, and their performance may suffer if the model is not precise or if the system exhibits significant nonlinear behavior. Overall, choosing between KPCA and an observer-based approach depends on the specific characteristics of the system, the availability of accurate models, and the level of nonlinearity present, as each method offers distinct strengths and weaknesses in fault detection applications.